Medellín
Decoding street network morphologies and their correlation to travel mode choice
Riascos-Goyes, Juan Fernando, Lowry, Michael, Guarín-Zapata, Nicolás, Ospina, Juan P.
Urban morphology has long been recognized as a factor shaping human mobility, yet comparative and formal classifications of urban form across metropolitan areas remain limited. Building on theoretical principles of urban structure and advances in unsupervised learning, we systematically classified the built environment of nine U.S. metropolitan areas using structural indicators such as density, connectivity, and spatial configuration. The resulting morphological types were linked to mobility patterns through descriptive statistics, marginal effects estimation, and post hoc statistical testing. Here we show that distinct urban forms are systematically associated with different mobility behaviors, such as reticular morphologies being linked to significantly higher public transport use (marginal effect = 0.49) and reduced car dependence (-0.41), while organic forms are associated with increased car usage (0.44), and substantial declines in public transport (-0.47) and active mobility (-0.30). These effects are statistically robust (p < 1e-19), highlighting that the spatial configuration of urban areas plays a fundamental role in shaping transportation choices. Our findings extend previous work by offering a reproducible framework for classifying urban form and demonstrate the added value of morphological analysis in comparative urban research. These results suggest that urban form should be treated as a key variable in mobility planning and provide empirical support for incorporating spatial typologies into sustainable urban policy design.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.14)
- North America > United States > North Carolina > Wake County > Cary (0.14)
- (19 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
Embedding-Aware Quantum-Classical SVMs for Scalable Quantum Machine Learning
Ordóñez, Sebastián Andrés Cajas, Torres, Luis Fernando Torres, Bifulco, Mario, Durán, Carlos Andrés, Bosch, Cristian, Carbajo, Ricardo Simón
Quantum Support Vector Machines face scalability challenges due to high-dimensional quantum states and hardware limitations. We propose an embedding-aware quantum-classical pipeline combining class-balanced k-means distillation with pretrained Vision Transformer embeddings. Our key finding: ViT embeddings uniquely enable quantum advantage, achieving up to 8.02% accuracy improvements over classical SVMs on Fashion-MNIST and 4.42% on MNIST, while CNN features show performance degradation. Using 16-qubit tensor network simulation via cuTensorNet, we provide the first systematic evidence that quantum kernel advantage depends critically on embedding choice, revealing fundamental synergy between transformer attention and quantum feature spaces. This provides a practical pathway for scalable quantum machine learning that leverages modern neural architectures.
- South America > Colombia > Cauca Department > Popayán (0.04)
- South America > Colombia > Antioquia Department > Medellín (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Information Technology (0.68)
- Health & Medicine > Therapeutic Area (0.46)
Enhancing software product lines with machine learning components
Cobaleda, Luz-Viviana, Carvajal, Julián, Vallejo, Paola, López, Andrés, Mazo, Raúl
Modern software systems increasingly integrate machine learning (ML) due to its advancements and ability to enhance data-driven decision-making. However, this integration introduces significant challenges for software engineering, especially in software product lines (SPLs), where managing variability and reuse becomes more complex with the inclusion of ML components. Although existing approaches have addressed variability management in SPLs and the integration of ML components in isolated systems, few have explored the intersection of both domains. Specifically, there is limited support for modeling and managing variability in SPLs that incorporate ML components. To bridge this gap, this article proposes a structured framework that extends Software Product Line engineering to integrate ML components. It supports the design of SPLs with ML capabilities by enabling systematic modeling of variability and reuse. The proposal has been partially implemented with the VariaMos tool.
- North America > United States > New York > New York County > New York City (0.04)
- South America > Colombia > Antioquia Department > Medellín (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (3 more...)
- Education > Educational Setting (0.67)
- Information Technology > Security & Privacy (0.46)
Cultivating Pluralism In Algorithmic Monoculture: The Community Alignment Dataset
Zhang, Lily Hong, Milli, Smitha, Jusko, Karen, Smith, Jonathan, Amos, Brandon, Bouaziz, Wassim, Revel, Manon, Kussman, Jack, Sheynin, Yasha, Titus, Lisa, Radharapu, Bhaktipriya, Yu, Jane, Sarma, Vidya, Rose, Kris, Nickel, Maximilian
How can large language models (LLMs) serve users with varying preferences that may conflict across cultural, political, or other dimensions? To address this challenge, this paper establishes four key results. First, we demonstrate, through a large-scale multilingual human study with representative samples from five countries (N=15,000), that humans exhibit significantly more variation in preferences than the responses of 21 state-of-the-art LLMs. Second, we show that existing methods for preference dataset collection are insufficient for learning the diversity of human preferences even along two of the most salient dimensions of variability in global values, due to the underlying homogeneity of candidate responses. Third, we argue that this motivates the need for negatively-correlated sampling when generating candidate sets, and we show that simple prompt-based techniques for doing so significantly enhance the performance of alignment methods in learning heterogeneous preferences. Fourth, based on this novel candidate sampling approach, we collect and open-source Community Alignment, the largest and most representative multilingual and multi-turn preference dataset to date, featuring almost 200,000 comparisons from annotators spanning five countries. We hope that the Community Alignment dataset will be a valuable resource for improving the effectiveness of LLMs for a diverse global population.
- Europe > Austria > Vienna (0.13)
- Asia > India (0.04)
- South America > Brazil (0.04)
- (24 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (0.92)
- (4 more...)
- Water & Waste Management > Water Management > Water Supplies & Services (1.00)
- Media > Music (1.00)
- Materials > Chemicals > Agricultural Chemicals (1.00)
- (17 more...)
CF-RAG: A Dataset and Method for Carbon Footprint QA Using Retrieval-Augmented Generation
Zhao, Kaiwen, Balaji, Bharathan, Lee, Stephen
Product sustainability reports provide valuable insights into the environmental impacts of a product and are often distributed in PDF format. These reports often include a combination of tables and text, which complicates their analysis. The lack of standardization and the variability in reporting formats further exacerbate the difficulty of extracting and interpreting relevant information from large volumes of documents. In this paper, we tackle the challenge of answering questions related to carbon footprints within sustainability reports available in PDF format. Unlike previous approaches, our focus is on addressing the difficulties posed by the unstructured and inconsistent nature of text extracted from PDF parsing. To facilitate this analysis, we introduce CarbonPDF-QA, an open-source dataset containing question-answer pairs for 1735 product report documents, along with human-annotated answers. Our analysis shows that GPT-4o struggles to answer questions with data inconsistencies. To address this limitation, we propose CarbonPDF, an LLM-based technique specifically designed to answer carbon footprint questions on such datasets. We develop CarbonPDF by fine-tuning Llama 3 with our training data. Our results show that our technique outperforms current state-of-the-art techniques, including question-answering (QA) systems fine-tuned on table and text data.
- North America > United States > District of Columbia > Washington (0.05)
- South America > Colombia > Antioquia Department > Medellín (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Public Relations > Community Relations (0.55)
- Research Report > New Finding (0.54)
- Social Sector (0.55)
- Law (0.34)
A Framework for Multi-source Privacy Preserving Epidemic Analysis
Guan, Zihan, Zhao, Zhiyuan, Tian, Fengwei, Nguyen, Dung, Bhattacharjee, Payel, Tandon, Ravi, Prakash, B. Aditya, Vullikanti, Anil
It is now well understood that diverse datasets provide a lot of value in key epidemiology and public health analyses, such as forecasting and nowcasting, development of epidemic models, evaluation and design of interventions, and resource allocation. Some of these datasets are sensitive and need adequate privacy protections. There are many models of privacy, but Differential Privacy (DP) has become a de facto standard because of its strong guarantees, which hold without making assumptions about adversaries. In this paper, we develop a framework that integrates deep learning and epidemic models to simultaneously perform epidemic forecasting and learn a mechanistic model of epidemic spread, while incorporating multiple datasets for these analyses, including some with DP guarantees. We demonstrate our framework using a realistic but synthetic financial dataset with DP; such a dataset has not been used in such epidemic analyses. We show that this dataset provides significant value in forecasting and learning an epidemic model, even when used with DP guarantees.
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- North America > United States > Arizona > Pima County > Tucson (0.14)
- South America > Colombia > Bogotá D.C. > Bogotá (0.04)
- (8 more...)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Communications (1.00)
- (3 more...)
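The abstract above does not say which DP mechanism protects the financial dataset. Purely as an illustration of what a DP guarantee on an aggregate can look like, here is a minimal sketch of the standard Laplace mechanism applied to a counting query. The function name `private_count` and the record field `pharmacy` are hypothetical and are not taken from the paper.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of matching records (Laplace mechanism).

    Hypothetical helper for illustration only; not the paper's method.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1, so the
    # sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    rate = epsilon  # rate of each exponential = 1 / scale
    # The difference of two i.i.d. exponentials is Laplace(0, 1 / rate).
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


# Hypothetical synthetic records: one in four is a pharmacy purchase.
records = [{"pharmacy": i % 4 == 0} for i in range(100)]
noisy = private_count(records, lambda r: r["pharmacy"], epsilon=1.0)
```

Smaller values of `epsilon` give stronger privacy and noisier counts. A downstream forecasting model would only ever see the noisy value.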
REMoH: A Reflective Evolution of Multi-objective Heuristics approach via Large Language Models
Forniés-Tabuenca, Diego, Uribe, Alejandro, Otamendi, Urtzi, Artetxe, Arkaitz, Rivera, Juan Carlos, de Lacalle, Oier Lopez
Multi-objective optimization is fundamental in complex decision-making tasks. Traditional algorithms, while effective, often demand extensive problem-specific modeling and struggle to adapt to nonlinear structures. Recent advances in Large Language Models (LLMs) offer enhanced explainability, adaptability, and reasoning. This work proposes Reflective Evolution of Multi-objective Heuristics (REMoH), a novel framework integrating NSGA-II with LLM-based heuristic generation. A key innovation is a reflection mechanism that uses clustering and search-space reflection to guide the creation of diverse, high-quality heuristics, improving convergence and maintaining solution diversity. The approach is evaluated on the Flexible Job Shop Scheduling Problem (FJSSP) through in-depth benchmarking against state-of-the-art methods using three instance datasets: Dauzere, Barnes, and Brandimarte. The results show that REMoH is competitive with state-of-the-art approaches while requiring less modeling effort and offering enhanced adaptability. These findings underscore the potential of LLMs to augment traditional optimization, offering greater flexibility, interpretability, and robustness in multi-objective scenarios.
- Europe > Spain > Basque Country (0.04)
- Asia > Singapore (0.04)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- (3 more...)
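NSGA-II, which the REMoH abstract above builds on, ranks candidate solutions by Pareto dominance. As a minimal sketch of that one ingredient (not of REMoH itself), the code below extracts the first non-dominated front, assuming every objective is minimized. The function names and the example objective pairs are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical (makespan, total workload) pairs for four schedules.
schedules = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(schedules)  # (3, 3) is dominated by (2, 2)
```

This quadratic scan is fine for small populations. Full NSGA-II additionally sorts the remaining points into successive fronts and uses crowding distance to preserve diversity.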
Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications
Dong, Wenhan, Zhao, Yuemeng, Sun, Zhen, Liu, Yule, Peng, Zifan, Zheng, Jingyi, Zhang, Zongmin, Zhang, Ziyi, Wu, Jun, Wang, Ruiming, Xu, Shengmin, Huang, Xinyi, He, Xinlei
As large language models (LLMs) are increasingly used in human-centered tasks, assessing their psychological traits is crucial for understanding their social impact and ensuring trustworthy AI alignment. While existing reviews have covered some aspects of related research, several important areas have not been systematically discussed, including detailed discussions of diverse psychological tests, LLM-specific psychological datasets, and the applications of LLMs with psychological traits. To address this gap, we systematically review six key dimensions of applying psychological theories to LLMs: (1) assessment tools; (2) LLM-specific datasets; (3) evaluation metrics (consistency and stability); (4) empirical findings; (5) personality simulation methods; and (6) LLM-based behavior simulation. Our analysis highlights both the strengths and limitations of current methods. While some LLMs exhibit reproducible personality patterns under specific prompting schemes, significant variability remains across tasks and settings. Recognizing methodological challenges such as mismatches between psychological tools and LLMs' capabilities, as well as inconsistencies in evaluation practices, this study aims to propose future directions for developing more interpretable, robust, and generalizable psychological assessment frameworks for LLMs.
- North America > United States > Florida > Miami-Dade County > Miami (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- (11 more...)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Leisure & Entertainment > Games (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.68)
- Education > Educational Setting (0.67)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)
Differential privacy enables fair and accurate AI-based analysis of speech disorders while protecting patient data
Arasteh, Soroosh Tayebi, Lotfinia, Mahshad, Perez-Toro, Paula Andrea, Arias-Vergara, Tomas, Ranji, Mahtab, Orozco-Arroyave, Juan Rafael, Schuster, Maria, Maier, Andreas, Yang, Seung Hee
Speech pathology has impacts on communication abilities and quality of life. While deep learning-based models have shown potential in diagnosing these disorders, the use of sensitive data raises critical privacy concerns. Although differential privacy (DP) has been explored in the medical imaging domain, its application in pathological speech analysis remains largely unexplored despite the equally critical privacy concerns. This study is the first to investigate DP's impact on pathological speech data, focusing on the trade-offs between privacy, diagnostic accuracy, and fairness. Using a large, real-world dataset of 200 hours of recordings from 2,839 German-speaking participants, we observed a maximum accuracy reduction of 3.85% when training with DP with high privacy levels. To highlight real-world privacy risks, we demonstrated the vulnerability of non-private models to explicit gradient inversion attacks, reconstructing identifiable speech samples and showcasing DP's effectiveness in mitigating these risks. To generalize our findings across languages and disorders, we validated our approach on a dataset of Spanish-speaking Parkinson's disease patients, leveraging pretrained models from healthy English-speaking datasets, and demonstrated that careful pretraining on large-scale task-specific datasets can maintain favorable accuracy under DP constraints. A comprehensive fairness analysis revealed minimal gender bias at reasonable privacy levels but underscored the need for addressing age-related disparities. Our results establish that DP can balance privacy and utility in speech disorder detection, while highlighting unique challenges in privacy-fairness trade-offs for speech data. This provides a foundation for refining DP methodologies and improving fairness across diverse patient groups in real-world deployments.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Austria > Vienna (0.14)
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
- (19 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
- Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (0.86)
- Health & Medicine > Therapeutic Area > Musculoskeletal (0.86)
CATALOG: A Camera Trap Language-guided Contrastive Learning Model
Santamaria, Julian D., Isaza, Claudia, Giraldo, Jhony H.
Foundation Models (FMs) have been successful in various computer vision tasks like image classification, object detection and image segmentation. However, these tasks remain challenging when these models are tested on datasets with different distributions from the training dataset, a problem known as domain shift. This is especially problematic for recognizing animal species in camera-trap images where we have variability in factors like lighting, camouflage and occlusions. In this paper, we propose the Camera Trap Language-guided Contrastive Learning (CATALOG) model to address these issues. Our approach combines multiple FMs to extract visual and textual features from camera-trap data and uses a contrastive loss function to train the model. We evaluate CATALOG on two benchmark datasets and show that it outperforms previous state-of-the-art methods in camera-trap image recognition, especially when the training and testing data have different animal species or come from different geographical areas. Our approach demonstrates the potential of using FMs in combination with multi-modal fusion and contrastive learning for addressing domain shifts in camera-trap image recognition. The code of CATALOG is publicly available at https://github.com/Julian075/CATALOG.
- South America > Colombia > Antioquia Department > Medellín (0.04)
- Europe > France (0.04)
- Africa (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
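The CATALOG abstract above mentions a contrastive loss over visual and textual features without giving its exact form. As a rough illustration only, here is a minimal pure-Python sketch of a generic InfoNCE-style loss, where the i-th image feature is assumed to match the i-th text feature. The function names and the temperature value are assumptions, not the paper's definitions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(image_feats, text_feats, temperature=0.1):
    """Mean image-to-text cross-entropy over a batch of matched pairs."""
    total = 0.0
    for i, img in enumerate(image_feats):
        logits = [cosine(img, txt) / temperature for txt in text_feats]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability assigned to the matching text.
        total += log_denom - logits[i]
    return total / len(image_feats)


# Toy features: aligned pairs should give a much lower loss.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_loss(images, [[1.0, 0.0], [0.0, 1.0]])
shuffled = contrastive_loss(images, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each image is most similar to its own description and large when the pairing is wrong, which is the signal such a model is trained on.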